Adaptive Learning Framework for the Biped Robot

Robotics And Machine ANalytics Laboratory (RAMAN LAB) Malaviya National Institute of Technology Jaipur

A biped robot is a walking robot that mimics human locomotion. It can operate on the irregular terrain and structures commonly found in the real world. Typical applications include disaster relief, muddy terrain, nuclear power plants, and structures built for humans. However, balancing a biped robot is harder than it is for a human because of the robot's structure: humans can shift the center of pressure (CoP) for stable walking using the upper body, hands, and leg muscles, whereas the biped robot consists only of a lower body and can vary the CoP only through its joint motors.

The kinematic structure and dynamics of a biped robot are very complex because of its multiple joints and floating base. It is therefore a major challenge to generate closed-loop trajectories for multiple tasks such as walking on flat and uneven terrain, dancing, and logistic support in industry. Researchers have developed various methodologies for generating closed-loop trajectories, but biped robots still fail to adapt to new situations encountered in the real world, which can be hazardous. There is therefore a need for a learning framework that can learn in real time and is also data-efficient. Fig. 1 presents a pictorial representation of the adaptive learning framework.

Figure 1: Pictorial Representation of Adaptive Learning Framework

The major components of the framework are:

State Estimator: It estimates the various parameters of a biped robot, such as position, velocity, acceleration, orientation, forces, and torques. The most prominent technique is recursive Bayesian filtering. However, designing these filters involves complex mathematical derivations, which makes it a difficult problem. Data-driven learned filters can overcome this hurdle and can be built with learning techniques such as Gaussian process regression, deep learning approaches, and hybrid approaches.

Policy Structure: It is the most important aspect of the framework, since the policy works as the brain of the robot and makes the important decisions based on the real-time situation. It should also be able to adapt to new situations that the robot encounters. Traditional policies take many iterations to learn a behavior, because billions of parameters in the policy may need tuning, which makes them infeasible to deploy directly in an unknown real-world arena. Therefore, the policy structure should be modified so that, after deployment, it depends on only a few parameters to change its behavior in real time.

Training Algorithm: It is the main component for learning the policy from the states and rewards. The key requirement is that it guides the learning process so that fewer data samples are needed.
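
As a point of reference for the State Estimator component, recursive Bayesian filtering can be illustrated with its simplest instance, a scalar Kalman filter. The following is a minimal sketch, not the lab's implementation: the class name, the random-walk motion model, the noise variances, and the sample encoder readings are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalmanFilter:
    """Recursive Bayesian filter for one scalar state with Gaussian noise."""
    mean: float             # current state estimate
    variance: float         # uncertainty of the estimate
    process_var: float      # noise added by the motion model at each step
    measurement_var: float  # noise of the sensor

    def predict(self, control: float = 0.0) -> None:
        # Assumed motion model: x_next = x + control + process noise.
        self.mean += control
        self.variance += self.process_var

    def update(self, measurement: float) -> None:
        # Blend prediction and measurement, weighted by their uncertainties.
        gain = self.variance / (self.variance + self.measurement_var)
        self.mean += gain * (measurement - self.mean)
        self.variance *= (1.0 - gain)


def estimate_joint_angle(readings, initial_mean=0.0, initial_var=1.0):
    """Filter a sequence of noisy joint-angle readings (radians)."""
    kf = ScalarKalmanFilter(initial_mean, initial_var,
                            process_var=1e-4, measurement_var=0.01)
    estimates = []
    for z in readings:
        kf.predict()   # propagate the belief one time step
        kf.update(z)   # correct it with the new measurement
        estimates.append(kf.mean)
    return estimates, kf.variance


# Hypothetical encoder readings around a true angle of 0.5 rad.
readings = [0.52, 0.48, 0.51, 0.49, 0.50]
estimates, final_variance = estimate_joint_angle(readings)
```

A learned filter of the kind described under State Estimator would replace the hand-written motion and measurement models here with models fitted from data; the predict and update cycle itself stays the same.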

The presented framework has the potential to become a first step toward artificial general intelligence in biped robots. However, it faces several challenges: (a) structuring the reward function in real time, (b) incorporating the additional parameters that regulate the policy structure, and (c) designing an efficient training algorithm.
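
Challenge (b), together with the Policy Structure requirement of changing behavior through only a few parameters, can be illustrated with a toy example. The sketch below is hypothetical and not the framework's actual policy: the frozen weights, the two-element latent vector (a gain and a bias), the reward function, and the use of random hill climbing as the adaptation rule are all assumptions made for illustration.

```python
import random

# Hypothetical frozen weights, standing in for a policy trained offline.
FROZEN_WEIGHTS = [[0.8, -0.3], [0.1, 0.5]]


def policy(state, latent):
    """Map a state to an action; only `latent` = (gain, bias) is adaptable."""
    gain, bias = latent
    base = [sum(w * s for w, s in zip(row, state)) for row in FROZEN_WEIGHTS]
    return [gain * b + bias for b in base]


def adapt_latent(reward_fn, latent, iterations=200, step=0.1, seed=0):
    """Random hill climbing over the latent only (assumed adaptation rule)."""
    rng = random.Random(seed)
    best = list(latent)
    best_reward = reward_fn(best)
    for _ in range(iterations):
        candidate = [v + rng.gauss(0.0, step) for v in best]
        reward = reward_fn(candidate)
        if reward > best_reward:  # keep a candidate only if it improves the reward
            best, best_reward = candidate, reward
    return best, best_reward


# Toy "new situation": the desired action comes from a latent unknown to the robot.
STATE = [1.0, 0.5]
TARGET_ACTION = policy(STATE, (1.5, -0.2))


def toy_reward(latent):
    """Negative squared error between the policy's action and the target."""
    action = policy(STATE, latent)
    return -sum((a - t) ** 2 for a, t in zip(action, TARGET_ACTION))


initial_latent = [1.0, 0.0]
initial_reward = toy_reward(initial_latent)
adapted_latent, adapted_reward = adapt_latent(toy_reward, initial_latent)
```

The point of the design is that the search space after deployment has two dimensions instead of the full weight matrix, so even a crude search needs few reward evaluations. How such a latent should be defined for a real biped policy is exactly the open question raised in (b).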

MARCH 13, 2022 BHARAT SINGH